This tutorial comes from Carson Sievert’s Plotly for R Master Class.
The plotly package depends on ggplot2, which bundles a data set on monthly housing sales in Texan cities acquired from the TAMU real estate center. After loading the package, the data is “lazily loaded” into your session, so you may reference it by name:
library(plotly)
txhousing
## # A tibble: 8,602 x 9
## city year month sales volume median listings inventory date
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abilene 2000 1 72 5380000 71400 701 6.3 2000
## 2 Abilene 2000 2 98 6505000 58700 746 6.6 2000.
## 3 Abilene 2000 3 130 9285000 58100 784 6.8 2000.
## 4 Abilene 2000 4 98 9730000 68600 785 6.9 2000.
## 5 Abilene 2000 5 141 10590000 67300 794 6.8 2000.
## 6 Abilene 2000 6 156 13910000 66900 780 6.6 2000.
## 7 Abilene 2000 7 152 12635000 73500 742 6.2 2000.
## 8 Abilene 2000 8 131 10710000 75000 765 6.4 2001.
## 9 Abilene 2000 9 104 7615000 64500 771 6.5 2001.
## 10 Abilene 2000 10 101 7040000 59300 764 6.6 2001.
## # … with 8,592 more rows
Let’s see if there’s any pattern in house price behavior over time:
p <- ggplot(txhousing, aes(x = date, y = median)) +
  geom_line(aes(group = city), alpha = 0.2)
class(p)
## [1] "gg" "ggplot"
It’d be nice if we could see which city each line corresponds to when we hover. plotly makes this easy! Just wrap your ggplot object in the ggplotly() function:
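As a minimal sketch of that step, assuming the plotly package is installed, the snippet below rebuilds the line chart and converts it. The object name gg is ours, chosen only for illustration.

```r
library(plotly)  # also attaches ggplot2, which provides txhousing

# Rebuild the static line chart from above
p <- ggplot(txhousing, aes(x = date, y = median)) +
  geom_line(aes(group = city), alpha = 0.2)

# Convert the ggplot object into an interactive plotly object
gg <- ggplotly(p)
class(gg)

# Printing the object renders the interactive chart
gg
```

Hovering over a line in the rendered chart should then show the values mapped in aes(), including the city used for grouping.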
We can also build plotly objects directly using the plot_ly() function along with dplyr-like syntax. Why would we want to? Well, for one thing, plot_ly() recognizes and preserves groupings created with dplyr’s group_by() function:
library(dplyr)
tx <- group_by(txhousing, city)
# initiate a plotly object with date on x and median on y
p <- plot_ly(tx, x = ~date, y = ~median)
# plotly_data() returns data associated with a plotly object
plotly_data(p)
## # A tibble: 8,602 x 9
## city year month sales volume median listings inventory date
## * <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abilene 2000 1 72 5380000 71400 701 6.3 2000
## 2 Abilene 2000 2 98 6505000 58700 746 6.6 2000.
## 3 Abilene 2000 3 130 9285000 58100 784 6.8 2000.
## 4 Abilene 2000 4 98 9730000 68600 785 6.9 2000.
## 5 Abilene 2000 5 141 10590000 67300 794 6.8 2000.
## 6 Abilene 2000 6 156 13910000 66900 780 6.6 2000.
## 7 Abilene 2000 7 152 12635000 73500 742 6.2 2000.
## 8 Abilene 2000 8 131 10710000 75000 765 6.4 2001.
## 9 Abilene 2000 9 104 7615000 64500 771 6.5 2001.
## 10 Abilene 2000 10 101 7040000 59300 764 6.6 2001.
## # … with 8,592 more rows
Since we didn’t specify a trace type, the plot defaults to a scatterplot:
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plot.ly/r/reference/#scatter
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plot.ly/r/reference/#scatter-mode
## Warning: Ignoring 616 observations
Let’s change that to a line chart. Similar to geom_line() in ggplot2, the add_lines() function connects (a group of) x/y pairs with lines in the order of their x values and returns the transformed plotly object:
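A minimal sketch of that call, assuming plotly and dplyr are installed. The name p_lines is ours and not part of the original tutorial.

```r
library(plotly)
library(dplyr)

# Group by city so that each city becomes its own line
tx <- group_by(txhousing, city)
p <- plot_ly(tx, x = ~date, y = ~median)

# add_lines() returns a new plotly object with a line trace added;
# the grouping set by group_by() is preserved
p_lines <- add_lines(p)
p_lines
```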
Want to highlight a particular line? dplyr’s filter() works on plotly objects, and since each add_lines() call returns the modified plotly object, we can chain calls together with pipes:
p <- txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(alpha = 0.2, name = "All TX Cities", hoverinfo = "none") %>%
  filter(city == "Houston") %>%
  add_lines(name = "Houston")
p
rangeslider():
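The original code for this step is not shown, so the following is only a sketch of how plotly's rangeslider() function might be applied to a chart like the one above. The name p_slider is ours.

```r
library(plotly)
library(dplyr)

p <- txhousing %>%
  group_by(city) %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(alpha = 0.2, name = "All TX Cities", hoverinfo = "none")

# rangeslider() adds a draggable window beneath the x-axis
# for zooming in on a span of dates
p_slider <- rangeslider(p)
p_slider
```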
And just so you don’t think we’re limited to line charts:
## Warning: Removed 616 rows containing non-finite values (stat_bin2d).
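The code that produced this warning is not shown. Since the warning mentions stat_bin2d, one plausible reconstruction is a binned heatmap built with ggplot2's geom_bin2d() and converted with ggplotly(). Treat the snippet below as a guess under that assumption, not as the original code.

```r
library(plotly)

# Bin the date/median pairs into a 2D histogram; rows with a missing
# median are dropped, which triggers a warning like the one above
p <- ggplot(txhousing, aes(x = date, y = median)) +
  geom_bin2d()

# Convert the binned plot into an interactive plotly object
p_heat <- ggplotly(p)
p_heat
```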
Check out The Plotly Cookbook for more details on specific plotly visualization types (“traces”).
htmlwidgets
Since plotly objects inherit properties from an htmlwidget object, any method that works for arranging htmlwidgets also works for plotly objects. In some sense, an htmlwidget object is just a collection of HTML tags, and the htmltools package provides some useful functions for working with HTML tags (RStudio and Inc. 2016). The tagList() function gathers multiple HTML tags into a tag list, and when printing a tag list inside of a knitr/rmarkdown document (Xie 2016; Allaire et al. 2016), it knows to render as HTML:
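A small sketch of tagList() in use, assuming htmltools and plotly are installed. The plots p1 and p2 are throwaway placeholders built from random data, not part of the housing example.

```r
library(htmltools)
library(plotly)

# Two placeholder plots built from random numbers
p1 <- plot_ly(x = rnorm(100), type = "histogram")
p2 <- plot_ly(x = rnorm(100), type = "box")

# Gather both widgets into a single tag list
plots <- tagList(p1, p2)

# Inside a knitr/rmarkdown document, printing `plots` renders both widgets.
# At the console, browsable() marks the tag list for display in a viewer.
browsable(plots)
```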
Each htmlwidget object is an HTML <div> tag, so by default each plot takes up its own row. More often than not, it is desirable to arrange multiple plots in a given row, and there are a few ways to do that. A very flexible approach is to wrap all of your plots in a flexbox (i.e., an HTML <div> with the Cascading Style Sheets (CSS) property display: flex). The tags$div() function from htmltools provides a way to wrap a <div> around both tag lists and htmlwidget objects, and to set attributes such as style.
library(htmltools)
# p2 is assumed to be a second plotly object; it is not defined above
tags$div(
  style = "display: flex; flex-wrap: wrap",
  tags$div(p, style = "width: 45%; padding: 1em;"),
  tags$div(p2, style = "width: 45%; padding: 1em;")
)
shiny
Another way to arrange multiple htmlwidget objects on a single page is to leverage the fluidPage(), fluidRow(), and column() functions from the shiny package:
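Here is a sketch of that layout, assuming shiny and plotly are installed. The plots p1 and p2 are placeholders built from random data, and the name ui is ours.

```r
library(shiny)
library(plotly)

# Two placeholder plots built from random numbers
p1 <- plot_ly(x = rnorm(100), type = "histogram")
p2 <- plot_ly(x = rnorm(100), type = "box")

# The underlying grid is 12 units wide, so two columns of
# width 6 split a row evenly
ui <- fluidPage(
  fluidRow(p1),
  fluidRow(
    column(6, p1),
    column(6, p2)
  )
)
```

The result is a set of HTML tags, so printing ui in an rmarkdown document should lay the plots out without running a Shiny server.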
subplot()
We could also use plotly’s built-in subplot() function to generate a single plotly object with a common y-axis:
## Warning: Can only have one: config
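The code behind this step is not shown, so the following is a sketch of one way to share a y-axis with subplot(). The choice of Houston and Dallas, and the names houston, dallas and both, are ours.

```r
library(plotly)
library(dplyr)

# One line chart per city
houston <- txhousing %>%
  filter(city == "Houston") %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(name = "Houston")

dallas <- txhousing %>%
  filter(city == "Dallas") %>%
  plot_ly(x = ~date, y = ~median) %>%
  add_lines(name = "Dallas")

# shareY = TRUE places both panels on a common y-axis
both <- subplot(houston, dallas, shareY = TRUE)
both
```

Unlike the tag-based layouts, the result here is a single plotly object rather than a collection of separate widgets.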
crosstalk
crosstalk is an R package that lets htmlwidgets share selection and filter state with one another. Though this dataset isn’t very large, we’ll use it to create a SharedData object that allows us to propagate interaction.
library(crosstalk)
shared_data <- txhousing %>%
  filter(city %in% c("Houston", "Dallas", "Galveston")) %>%
  SharedData$new(~year)
As far as ggplotly() and plot_ly() are concerned, SharedData objects act just like a data frame, but with a special key attribute attached to graphical elements. Since both interfaces are based on the layered grammar of graphics, key attributes can be attached at the layer level, and those attributes can also be shared across multiple views. Let’s leverage both of these features to link multiple views of median house prices in various Texan cities:
p <- ggplot(shared_data, aes(month, median)) +
  geom_line(aes(group = year)) +
  facet_wrap(~ city)
ggplotly(p, tooltip = "year") %>%
  highlight(color = "red")
## Setting the `off` event (i.e., 'plotly_doubleclick') to match the `on` event (i.e., 'plotly_click'). You can change this default via the `highlight()` function.
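To make the key attribute more concrete, here is a small sketch that inspects a SharedData object directly. The toy data frame, its column names, and the group name are made up for illustration.

```r
library(crosstalk)

# A toy data frame; the columns are hypothetical
toy <- data.frame(id = c("a", "b", "c"), value = c(10, 20, 30))

# key = ~id assigns each row a key taken from the id column;
# group names the set of views that share selections
shared_toy <- SharedData$new(toy, key = ~id, group = "toy_group")

shared_toy$key()        # one key per row
shared_toy$groupName()  # the group name supplied above
shared_toy$origData()   # the underlying data frame
```

In the housing example the key is ~year, so rows from the same year share a key, and selecting one line highlights that year in every panel.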
We can also link different views, like those from ggpairs. Let’s look at the iris dataset because it’s easy on the eyes:
shared_iris <- SharedData$new(iris)
p <- GGally::ggpairs(shared_iris, aes(color = Species), columns = 1:4)
highlight(ggplotly(p), on = "plotly_selected")
## Warning: Can only have one: highlight
## Warning: Can only have one: highlight
## Warning: Can only have one: highlight
## Setting the `off` event (i.e., 'plotly_deselect') to match the `on` event (i.e., 'plotly_selected'). You can change this default via the `highlight()` function.
Really, we can link any collection of htmlwidgets:
library(leaflet)
shared_quakes <- SharedData$new(quakes)
p <- plot_ly(shared_quakes, x = ~depth, y = ~mag) %>%
  add_markers(alpha = 0.5) %>%
  highlight("plotly_selected", dynamic = TRUE)
## Adding more colors to the selection color palette.
## Assuming "long" and "lat" are longitude and latitude, respectively
## Setting the `off` event (i.e., 'plotly_deselect') to match the `on` event (i.e., 'plotly_selected'). You can change this default via the `highlight()` function.
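The map half of this example is not shown above. The message about "long" and "lat" suggests a leaflet layer drawn from the same SharedData object, so the sketch below fills in one plausible version. The use of addCircles() and bscols(), and the name map, are our assumptions.

```r
library(plotly)
library(crosstalk)
library(leaflet)

# Both widgets read from the same SharedData object
shared_quakes <- SharedData$new(quakes)

p <- plot_ly(shared_quakes, x = ~depth, y = ~mag) %>%
  add_markers(alpha = 0.5) %>%
  highlight("plotly_selected", dynamic = TRUE)

# leaflet guesses the coordinates from the "long" and "lat" columns
map <- leaflet(shared_quakes) %>%
  addTiles() %>%
  addCircles()

# bscols() places the two widgets side by side
bscols(p, map)
```

Selecting points in the scatterplot should then highlight the matching earthquakes on the map.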
We’ll see more about leaflet maps this afternoon.
Try some of these ideas out on your own dataset and see what you can come up with!